An Integrated Model of Multiple-Condition ChIP-Seq Data Reveals Predeterminants of Cdx2 Binding

Authors

  • Shaun Mahony
  • Matthew D. Edwards
  • Esteban O. Mazzoni
  • Richard Sherwood
  • Akshay Kakumanu
  • Carolyn A. Morrison
  • Hynek Wichterle
  • David K. Gifford

Abstract

Regulatory proteins can bind to different sets of genomic targets in various cell types or conditions. To reliably characterize such condition-specific regulatory binding we introduce MultiGPS, an integrated machine learning approach for the analysis of multiple related ChIP-seq experiments. MultiGPS is based on a generalized Expectation Maximization framework that shares information across multiple experiments for binding event discovery. We demonstrate that our framework enables the simultaneous modeling of sparse condition-specific binding changes, sequence dependence, and replicate-specific noise sources. MultiGPS encourages consistency in reported binding event locations across multiple-condition ChIP-seq datasets and provides accurate estimation of ChIP enrichment levels at each event. MultiGPS's multi-experiment modeling approach thus provides a reliable platform for detecting differential binding enrichment across experimental conditions. We demonstrate the advantages of MultiGPS with an analysis of Cdx2 binding in three distinct developmental contexts. By accurately characterizing condition-specific Cdx2 binding, MultiGPS enables novel insight into the mechanistic basis of Cdx2 site selectivity. Specifically, the condition-specific Cdx2 sites characterized by MultiGPS are highly associated with pre-existing genomic context, suggesting that such sites are pre-determined by cell-specific regulatory architecture. However, MultiGPS-defined condition-independent sites are not predicted by pre-existing regulatory signals, suggesting that Cdx2 can bind to a subset of locations regardless of genomic environment. A summary of this paper appears in the proceedings of the RECOMB 2014 conference, April 2-5.

Similar resources

Genome-wide analysis of CDX2 binding in intestinal epithelial cells (Caco-2).

The CDX2 transcription factor is known to play a crucial role in inhibiting proliferation, promoting differentiation and the expression of intestinal specific genes in intestinal cells. The overall effect of CDX2 in intestinal cells has previously been investigated in conditional knock-out mice, revealing a critical role of CDX2 in the formation of the normal intestinal identity. The identifica...

TherMos: Estimating protein–DNA binding energies from in vivo binding profiles

Accurately characterizing transcription factor (TF)-DNA affinity is a central goal of regulatory genomics. Although thermodynamics provides the most natural language for describing the continuous range of TF-DNA affinity, traditional motif discovery algorithms focus instead on classification paradigms that aim to discriminate 'bound' and 'unbound' sequences. Moreover, these algorithms do not di...

Comparative study on ChIP-seq data: normalization and binding pattern characterization

Motivation: Antibody-based chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is a relatively new method for studying the binding patterns of specific protein molecules over the entire genome. ChIP-seq technology allows scientists to get more comprehensive results in a shorter time. Here, we present a non-linear normalization algorithm and a mixture modelin...

LASAGNA-Search: an integrated web tool for transcription factor binding site search and visualization.

The release of ChIP-seq data from the ENCyclopedia Of DNA Elements (ENCODE) and Model Organism ENCyclopedia Of DNA Elements (modENCODE) projects has significantly increased the amount of transcription factor (TF) binding affinity information available to researchers. However, scientists still routinely use TF binding site (TFBS) search tools to scan unannotated sequences for TFBSs, particularly...

Integrated analysis of transcript-level regulation of metabolism reveals disease-relevant nodes of the human metabolic network

Metabolic diseases and comorbidities represent an ever-growing epidemic where multiple cell types impact tissue homeostasis. Here, the link between the metabolic and gene regulatory networks was studied through experimental and computational analysis. Integrating gene regulation data with a human metabolic network prompted the establishment of an open-sourced web portal, IDARE (Integrated Data ...

Journal title:

Volume 10, Issue

Pages -

Publication date: 2014